Combining linear regression models: When and how?
Authors
Abstract
Model combining (mixing) methods have been proposed in recent years to deal with uncertainty in model selection. Although the advantages of model combining over model selection have been demonstrated in simulations and data examples, it remains largely unclear when model combining should be preferred. In this work, we first propose an instability measure, named PIE, that captures the uncertainty of model selection in estimation and is based on perturbation of the sample. We demonstrate that estimators from model selection can have large PIE values and that model combining substantially reduces the instability in such cases. Second, we propose a model combining method, ARMS, and derive a theoretical property. In ARMS, a screening step narrows down the list of candidate models before combining, which not only saves computing time but can also improve estimation accuracy. Third, we compare ARMS with EBMA (an empirical Bayesian model averaging method) and with model selection methods in a number of simulations and real data examples. The comparison shows that model combining produces better estimators when the instability of model selection is high, and that ARMS performs better than EBMA in most such cases in our simulations. For the choice between model selection and model combining, we propose a rule of thumb in terms of PIE. The empirical results support that PIE is a sensible indicator of model selection instability in estimation and is useful for understanding whether model combining is a better choice than model selection for the data at hand.
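The idea of measuring selection instability by perturbing the sample can be illustrated with a small sketch. The abstract does not state how PIE is defined, so everything below is an assumption made for illustration: Gaussian noise is added to the response, all-subsets BIC stands in for the model selection rule, and the instability is summarized as the slope of the change in fitted values against the perturbation size. The function names (`fit_ols`, `select_by_bic`, `perturbation_instability`) are hypothetical and this is not the authors' implementation.

```python
import itertools
import numpy as np


def fit_ols(X, y, cols):
    """Least-squares fit on the chosen columns plus an intercept.

    Returns the fitted values and the residual sum of squares.
    """
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ beta
    return fitted, float(np.sum((y - fitted) ** 2))


def select_by_bic(X, y):
    """All-subsets selection by BIC (an assumed stand-in for any selection rule).

    Returns the fitted values of the selected model.
    """
    n, p = X.shape
    best_bic, best_fit = np.inf, None
    for k in range(p + 1):
        for cols in itertools.combinations(range(p), k):
            fitted, rss = fit_ols(X, y, cols)
            bic = n * np.log(rss / n) + (k + 1) * np.log(n)
            if bic < best_bic:
                best_bic, best_fit = bic, fitted
    return best_fit


def perturbation_instability(X, y, taus=(0.1, 0.3, 0.5), reps=20, seed=0):
    """Assumed perturbation-based instability score (illustrative only).

    For each perturbation size tau, noise with standard deviation
    tau * sigma_hat is added to y, selection is rerun, and the root mean
    squared change in fitted values (in units of sigma_hat) is averaged
    over replications. The score is the through-origin slope of that
    average against tau.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Noise scale estimated from the full model (an assumption).
    _, rss_full = fit_ols(X, y, range(p))
    sigma_hat = np.sqrt(rss_full / (n - p - 1))
    base_fit = select_by_bic(X, y)

    avg_change = []
    for tau in taus:
        changes = []
        for _ in range(reps):
            y_pert = y + rng.normal(0.0, tau * sigma_hat, size=n)
            pert_fit = select_by_bic(X, y_pert)
            changes.append(np.sqrt(np.mean((pert_fit - base_fit) ** 2)) / sigma_hat)
        avg_change.append(np.mean(changes))

    t = np.asarray(taus, dtype=float)
    c = np.asarray(avg_change, dtype=float)
    return float(t @ c / (t @ t))
```

Under these assumptions, a larger score means the selected fit moves more under small perturbations of the data, which is the situation in which the abstract reports that model combining helps.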
Similar resources
New Approach in Fitting Linear Regression Models with the Aim of Improving Accuracy and Power
The main contribution of this work lies in challenging the common practice of inferential statistics in simple linear regression in order to attain a higher degree of accuracy when multiple observations are available at at least one level of the regressor variable. We derive sufficient conditions under which one can improve the accuracy of the interval estimations at quite affordable ...
Which Methodology is Better for Combining Linear and Nonlinear Models for Time Series Forecasting?
Both theoretical and empirical findings have suggested that combining different models can be an effective way to improve on the predictive performance of each individual model. This is especially the case when the models in the ensemble are quite different. Hybrid techniques that decompose a time series into its linear and nonlinear components are one of the most important kinds of the hybrid model...
Liu Estimates and Influence Analysis in Regression Models with Stochastic Linear Restrictions and AR (1) Errors
In linear regression models with an AR(1) error structure, when collinearity exists, stochastic linear restrictions or modifications of biased estimators (including Liu estimators) can be used to reduce the estimated variance of the regression coefficient estimates. In this paper, the combination of the biased Liu estimator and stochastic linear restrictions estimator is considered to overcom...
Prediction of the main caving span in longwall mining using fuzzy MCDM technique and statistical method
Immediate roof caving in longwall mining is a complex dynamic process, and it is the core of numerous issues and challenges in this method. Hence, a reliable prediction of the strata behavior and its caving potential is imperative in the planning stage of a longwall project. The span of the main caving is the quantitative criterion that represents cavability. In this paper, two approaches are p...
The Family of Scale-Mixture of Skew-Normal Distributions and Its Application in Bayesian Nonlinear Regression Models
In previous studies on fitting non-linear regression models with a symmetric structure, normality is usually assumed in the analysis of data. This choice may be inappropriate when the distribution of the residual terms is asymmetric. Recently, the family of scale-mixture of skew-normal distributions has been the main concern of many researchers. This family includes several skewed and heavy-tailed d...
ESTIMATING THE PARAMETERS OF A FUZZY LINEAR REGRESSION MODEL
Fuzzy linear regression models are used to obtain an appropriate linear relation between a dependent variable and several independent variables in a fuzzy environment. Several methods for evaluating fuzzy coefficients in linear regression models have been proposed. The first attempts at estimating the parameters of a fuzzy regression model used mathematical programming methods. In this the...